ABSTRACT

A new spectral-spatial method for hyperspectral data classification
is proposed. For a given hyperspectral image, probabilistic
pixelwise classification is first applied. Then, a hierarchical
step-wise optimization algorithm is performed, by iteratively
merging neighboring regions with the smallest Dissimilarity
Criterion (DC) and recomputing class labels for the new regions.
The DC is computed by comparing the region mean vectors, the class
labels and the number of pixels in the two regions under
consideration. The algorithm converges when all the pixels have
been involved in the region merging procedure. Experimental results
are presented on two remote sensing hyperspectral images acquired
by the AVIRIS and ROSIS sensors. The proposed approach improves
classification accuracies and provides maps with more homogeneous
regions, when compared to previously proposed classification
techniques.

Index Terms: Hyperspectral imaging, hierarchical segmentation,
classification, support vector machines.

1. INTRODUCTION 

In hyperspectral imagery, each pixel is represented by a detailed
spectrum of the received light. Since different substances exhibit
different spectral signatures, hyperspectral imagery is a
well-suited technology for accurate image classification. However,
the large number of spectral channels presents challenges to image
analysis.

An extensive literature is available on classification of
hyperspectral images [1, 2]. Recent studies have shown the
advantage of considering the correlations between spatially
adjacent pixels for accurate image classification, i.e., applying
spectral-spatial classification [3, 4]. One of the recently
proposed approaches consists of performing image segmentation
(partitioning of the image into homogeneous regions) and then using
the identified regions as adaptive neighborhoods for all the pixels
within these regions [5]. However, the accuracy of segmentation
results strongly depends on the chosen criterion of region
homogeneity. In order to mitigate this dependence, we have recently
proposed to perform probabilistic classification for selecting the
most reliably classified pixels as markers, or region seeds, for
region growing [4]. This technique led to a significant improvement
of classification accuracies when compared to previously proposed
methods. The drawback of this method is that the selection of
markers strongly depends on the performance of the initial
classifier: non-marked regions disappear in the final
classification map, while if a marker is assigned to the wrong
class, the whole region grown from this marker risks being wrongly
classified.

In this work, we propose to use the Hierarchical Step-Wise
Optimization (HSWO) method for including spatial dependencies into
a classification procedure. HSWO is a segmentation approach which
iteratively merges pairs of the most similar spatially adjacent
regions, and generates at its output a hierarchical set of image
segmentations [6]. We propose to use supervised classification
results for computing more accurately a sequence of region merges
and for defining a convergence criterion, leading to a single
spectral-spatial classification map. Thus, a new Classification and
Hierarchical Optimization (CaHO) method for hyperspectral images is
proposed. First, probabilistic pixelwise classification of the
input image is performed. Then, at each iteration the two
neighboring regions with the smallest Dissimilarity Criterion (DC)
are merged, and a class label for the new region is computed. The
DC between regions is defined as a function of region statistical
features, the number of pixels in the considered regions and their
class labels. When all image pixels have been involved in region
merging, the algorithm converges, resulting in a spectral-spatial
classification map.

The paper is organized as follows. The next section presents the
proposed CaHO method. Experimental results are presented and
discussed in Section 3. Finally, conclusions are drawn in
Section 4.

2. PROPOSED METHOD 

The input is a B-band hyperspectral image, which can be considered
as a set of n pixel vectors $X = \{x_j \in \mathbb{R}^B, j = 1, 2, \ldots, n\}$.
The objective is to compute a classification map
$L = \{L_j, j = 1, 2, \ldots, n\}$, where each pixel $x_j$ is
assigned to one of K thematic classes (i.e., has a class label
$L_j$). The proposed CaHO method, illustrated in Fig. 1, is
composed of two main steps: probabilistic pixelwise classification
(Section 2.1) and hierarchical optimization (Section 2.2).

Fig. 1. Flowchart of the proposed CaHO approach. “DC”
means Dissimilarity Criterion.

2.1. Probabilistic pixelwise classification

The aim of the first step is to compute a classification map
$L = \{L_j, j = 1, 2, \ldots, n\}$ for a given hyperspectral image,
where each pixel has a unique class label, and class probabilities
for each pixel $\{P(L_j = k \mid x_j), k = 1, \ldots, K\}$,
$j = 1, 2, \ldots, n$. We propose to perform probabilistic Support
Vector Machines (SVM) classification for this purpose, which is
extremely well suited for classifying hyperspectral data [1]. We
refer the reader to [1] and [7] for details on the SVM method, and
to [4] for details on how class probabilities are estimated using
pairwise coupling of binary probability estimates.

2.2. Hierarchical optimization

At this step, regularization of the classification map obtained at
the previous step is performed, by applying a new hierarchical
optimization approach as follows:

1) Initialize the optimization by labeling each image pixel as a
separate region. Each one-pixel region $R_i$ has a class label
$L(R_i)$ and a K-dimensional vector of class probabilities
$\{P_k(R_i) = P(L(R_i) = k \mid R_i), k = 1, \ldots, K\}$.

2) Compute the DC between all pairs of neighboring regions, using
an eight-connectivity neighborhood. The DC between two regions
$R_i$ and $R_j$, $DC(R_i, R_j)$, is calculated using the following
algorithm:

• Compute the dissimilarity measure $DC_{spectral}(R_i, R_j)$
between the two regions by comparing spectral values of the pixels
within these regions. We investigated the use of two dissimilarity
measures for this purpose. The Spectral Angle Mapper (SAM) between
the region mean vectors $u_i = (u_{i1}, \ldots, u_{iB})^T$ and
$u_j = (u_{j1}, \ldots, u_{jB})^T$ is defined as the angle between
them:

$$\mathrm{SAM}(u_i, u_j) = \arccos\left( \frac{\sum_{b=1}^{B} u_{ib} u_{jb}}{\left[\sum_{b=1}^{B} u_{ib}^2\right]^{1/2} \left[\sum_{b=1}^{B} u_{jb}^2\right]^{1/2}} \right). \tag{1}$$

The Square root of band sum Mean Squared Error (MSE) measure is
based on minimizing the increase of MSE between the region mean
vector and the original image data and is computed as

$$\mathrm{MSE}(u_i, u_j) = \sqrt{\frac{n_i n_j}{n_i + n_j} \sum_{b=1}^{B} (u_{ib} - u_{jb})^2}, \tag{2}$$

where $n_i$ and $n_j$ are the numbers of pixels in the regions
$R_i$ and $R_j$, respectively.

• If the regions have equal class labels, $L(R_i) = L(R_j)$, the DC
between these regions is

$$DC(R_i, R_j) = DC_{spectral}(R_i, R_j). \tag{3}$$

• If the regions have different class labels,
$L(R_i) \neq L(R_j)$, the DC between them is found as follows:

a) If $n_i > M$ and $n_j > M$, $DC(R_i, R_j) = \infty$ (in
practice, the maximum floating-point value). This means that if two
large regions are assigned to different classes, they cannot be
merged together. This condition is included to favor the merging of
small regions.

b) Otherwise,

$$DC(R_i, R_j) = W \cdot DC_{spectral}(R_i, R_j), \tag{4}$$

where $W > 1$. This means that if two regions have different class
labels, the DC between them is penalized by a constant W.


3) Find the smallest DC value $DC_{min}$.

4) Merge all pairs of neighboring regions satisfying
$DC = DC_{min}$. For each new region $R_{new}$ created by merging
two regions $R_i$ and $R_j$, recalculate:

• Class probabilities as

$$P_k(R_{new}) = \frac{P_k(R_i)\, n_i + P_k(R_j)\, n_j}{n_{new}}, \quad k = 1, \ldots, K, \tag{5}$$

where $n_{new} = n_i + n_j$.





Table 1. Information Classes, Number of Labeled Samples (No. of Samp.) and Classification Accuracies in Percentage for the 
Indian Pines Image. 



| Class | No. of Samp. (Train) | No. of Samp. (Test) | SVM | ECHO | SVMMSF | HSEG+MV | CaHO, SAM (W = 1.5) | CaHO, MSE (W = 1.5) |
|---|---|---|---|---|---|---|---|---|
| Overall Accuracy | - | - | 78.17 | 82.64 | 88.41 | 90.86 | 88.87 | 89.15 |
| Average Accuracy | - | - | 85.97 | 83.75 | 91.57 | 93.96 | 93.75 | 93.82 |
| Corn-no till | 50 | 1384 | 78.18 | 83.45 | 90.97 | 90.46 | 95.38 | 94.22 |
| Corn-min till | 50 | 784 | 69.64 | 75.13 | 69.52 | 83.04 | 80.36 | 79.21 |
| Corn | 50 | 184 | 91.85 | 92.39 | 95.65 | 95.65 | 97.28 | 96.20 |
| Soybeans-no till | 50 | 918 | 82.03 | 90.10 | 98.04 | 92.06 | 97.28 | 94.99 |
| Soybeans-min till | 50 | 2418 | 58.95 | 64.14 | 81.97 | 84.04 | 73.53 | 74.52 |
| Soybeans-clean till | 50 | 564 | 87.94 | 89.89 | 85.99 | 95.39 | 89.89 | 94.86 |
| Alfalfa | 15 | 39 | 74.36 | 48.72 | 94.87 | 92.31 | 97.44 | 94.87 |
| Grass/pasture | 50 | 447 | 92.17 | 94.18 | 94.63 | 94.41 | 93.96 | 97.32 |
| Grass/trees | 50 | 697 | 91.68 | 96.27 | 92.40 | 97.56 | 97.70 | 97.56 |
| Grass/pasture-mowed | 15 | 11 | 100 | 36.36 | 100 | 100 | 100 | 100 |
| Hay-windrowed | 50 | 439 | 97.72 | 97.72 | 99.77 | 99.54 | 99.54 | 99.32 |
| Oats | 15 | 5 | 100 | 100 | 100 | 100 | 100 | 100 |
| Wheat | 50 | 162 | 98.77 | 98.15 | 99.38 | 98.15 | 99.38 | 99.38 |
| Woods | 50 | 1244 | 93.01 | 94.21 | 97.59 | 98.63 | 98.63 | 99.04 |
| Bldg-Grass-Tree-Drives | 50 | 330 | 61.52 | 81.52 | 68.79 | 82.12 | 79.70 | 81.82 |
| Stone-steel towers | 50 | 45 | 97.78 | 97.78 | 95.56 | 100 | 100 | 97.78 |

• Class label as

$$L(R_{new}) = \arg\max_{k} \{P_k(R_{new})\}. \tag{6}$$

5) Stop if each image pixel has been involved at least once in the
region merging procedure. Otherwise, recalculate the DC values for
the new regions and all regions spatially adjacent to them, and
return to step 3.
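The building blocks of steps 2 and 4 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the experiments: the function names and the dictionary layout of a region are hypothetical, the MSE measure is written in the usual form of the squared-error increase caused by a merge (an assumption of this sketch), the region mean is updated as a pixel-count-weighted average, and class indices are 0-based.

```python
import math


def sam(u_i, u_j):
    """Spectral angle (in radians) between two nonzero region mean vectors."""
    dot = sum(a * b for a, b in zip(u_i, u_j))
    norm_i = math.sqrt(sum(a * a for a in u_i))
    norm_j = math.sqrt(sum(b * b for b in u_j))
    # Clamp the cosine to [-1, 1] so rounding error cannot break acos.
    cosine = max(-1.0, min(1.0, dot / (norm_i * norm_j)))
    return math.acos(cosine)


def mse_dissimilarity(u_i, u_j, n_i, n_j):
    """Square root of the band-summed MSE increase of a merge (assumed form)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u_i, u_j))
    return math.sqrt(n_i * n_j / (n_i + n_j) * squared_distance)


def dissimilarity_criterion(dc_spectral, label_i, label_j, n_i, n_j, W=1.5, M=20):
    """Apply the label-dependent rules of step 2 to a spectral dissimilarity."""
    if label_i == label_j:
        return dc_spectral      # equal labels: no penalty
    if n_i > M and n_j > M:
        return math.inf         # two large regions of different classes never merge
    return W * dc_spectral      # different labels: penalize by the constant W


def merge_regions(r_i, r_j):
    """Merge two regions given as dicts with keys 'n', 'mean' and 'probs'."""
    n_i, n_j = r_i["n"], r_j["n"]
    n_new = n_i + n_j
    # The mean of the union is the pixel-count-weighted mean of both regions.
    mean = [(a * n_i + b * n_j) / n_new for a, b in zip(r_i["mean"], r_j["mean"])]
    # Class probabilities are updated with the same weighting (step 4).
    probs = [(p * n_i + q * n_j) / n_new for p, q in zip(r_i["probs"], r_j["probs"])]
    # The new class label is the index of the largest probability (0-based).
    label = max(range(len(probs)), key=probs.__getitem__)
    return {"n": n_new, "mean": mean, "probs": probs, "label": label}
```

In this sketch a one-pixel region with probabilities (0.9, 0.1) merged with a three-pixel region with probabilities (0.2, 0.8) takes the label of the larger region, which shows how the pixel-count weighting lets well-supported regions absorb isolated, possibly misclassified pixels.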

The proposed convergence criterion assumes that the image does not
contain one-pixel regions of interest. If such regions may exist,
the algorithm must be stopped earlier. The convergence criterion in
this case can, for instance, compare the class probabilities of the
next candidates for merging, and stop the procedure when these
candidates belong to different classes with probabilities higher
than a defined threshold. Another, simpler criterion consists of
stopping the algorithm when [(1 − P)n] pixels have been involved in
region merging, where P (0 < P < 1) is the probability of
occurrence of one-pixel regions in the considered image. Since the
images used for our experiments do not contain one-pixel regions of
interest, we use the convergence criterion proposed in step 5.
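The stopping rules discussed above can be written down compactly. The following sketch is illustrative only: the function names are hypothetical, the per-pixel flags `involved` are an assumed bookkeeping structure, the brackets in [(1 − P)n] are assumed to denote rounding up to an integer, and the probability threshold is left as a user-supplied value because the text does not fix one.

```python
import math


def merged_pixel_target(n_pixels, p_single):
    """Number of pixels that must be involved in merging before stopping.

    Assumes [(1 - P) n] is rounded up to the next integer.
    """
    if not 0.0 < p_single < 1.0:
        raise ValueError("P must satisfy 0 < P < 1")
    return math.ceil((1.0 - p_single) * n_pixels)


def should_stop(involved, p_single=None):
    """Count-based criteria: step 5 by default, the relaxed rule if P is given.

    `involved` holds one boolean per pixel, True once the pixel has taken
    part in at least one merge.
    """
    n_pixels = len(involved)
    target = n_pixels if p_single is None else merged_pixel_target(n_pixels, p_single)
    return sum(involved) >= target


def confident_disagreement(label_i, prob_i, label_j, prob_j, threshold):
    """Probability-based criterion for the next pair of merge candidates.

    True when both regions are assigned to different classes with
    probabilities above the threshold, which signals that merging should stop.
    """
    return label_i != label_j and prob_i > threshold and prob_j > threshold
```

For a 145 by 145 image (21025 pixels) and an assumed P = 0.01, `merged_pixel_target` returns 20815, so the procedure would stop while up to 210 pixels are still untouched one-pixel regions.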

3. EXPERIMENTAL RESULTS AND DISCUSSION 

We applied the proposed CaHO method to two hyperspectral airborne
images, described in the following:

1) The Indian Pines image was recorded by the AVIRIS sensor over a
vegetation area. It is 145 by 145 pixels, with a spatial resolution
of 20 m/pixel and 200 spectral channels. Sixteen information
classes are considered, which are detailed in Table 1, with the
number of training and test samples for each class. Training
samples were randomly selected from the reference data. The
remaining samples composed the test set.

2) The Center of Pavia image was acquired by the ROSIS sensor over
the urban area of Pavia, Italy. The image is 785 by 300 pixels,
with a spatial resolution of 1.3 m/pixel, 102 spectral channels and
nine classes of interest. Thirty samples for each class were
randomly chosen from the reference data as training samples. More
information about the image and the training-test set used can be
found in [8].

Table 2. CaHO Overall and Average Classification Accuracies (OA and
AA, respectively) for the Indian Pines Image for Different Values
of the Parameter W.

| DC | Accuracy | W = 1.0 | W = 1.25 | W = 1.5 | W = 1.75 | W = 2.0 | W = 3.0 |
|---|---|---|---|---|---|---|---|
| SAM | OA | 87.34 | 88.61 | 88.87 | 88.42 | 88.44 | 86.80 |
| SAM | AA | 91.74 | 93.72 | 93.75 | 93.04 | 93.49 | 92.81 |
| MSE | OA | 88.93 | 88.34 | 89.15 | 87.81 | 88.08 | 87.26 |
| MSE | AA | 87.48 | 92.65 | 93.82 | 93.23 | 93.32 | 93.27 |

Table 3. Classification Accuracies in Percentage for the Center of
Pavia Image.

| | SVM | SVMMSF | HSEG+MV | CaHO, SAM (W = 1.5) | CaHO, MSE (W = 1.5) |
|---|---|---|---|---|---|
| Overall Acc. | 94.96 | 91.31 | 96.67 | 96.58 | 96.51 |
| Average Acc. | 92.56 | 92.64 | 95.41 | 95.61 | 95.60 |

Fig. 2. Indian Pines image. (a) SVM classification map. (b) CaHO
classification map (MSE $DC_{spectral}$, W = 1.5).

For both images, the probabilistic one-versus-one SVM
classification with the Gaussian Radial Basis Function (RBF) kernel
was performed. The optimal parameters C (penalty during the SVM
optimization) and γ (spread of the RBF kernel) were selected by
fivefold cross validation. Then, the proposed hierarchical
optimization was applied using the SAM and the MSE spectral
dissimilarity measures (the algorithm was implemented using the
Hierarchical Segmentation software [9]). We set the parameter
M = 20. Table 2 gathers overall and average (i.e., averaged over
the classes) accuracies of the CaHO method for the Indian Pines
image for different values of the parameter W. It can be seen from
the table that the method is robust to the choice of W: over the
tested range of W from 1.0 to 3.0, the overall accuracy stays
between 86.80% and 89.15% for both the SAM and the MSE
dissimilarity measures. The best accuracies are achieved with
W = 1.5.
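The two global measures reported in the tables differ only in how the test samples are weighted, which a short sketch makes explicit. The function name and the toy labels below are hypothetical and serve only as an illustration; they are not taken from the experiments.

```python
def overall_and_average_accuracy(true_labels, predicted_labels):
    """Return (OA, AA, per-class accuracies), all in percent.

    OA is the share of correctly classified test samples; AA is the mean of
    the class-specific accuracies, so every class counts equally.
    """
    totals = {}
    correct = {}
    for truth, prediction in zip(true_labels, predicted_labels):
        totals[truth] = totals.get(truth, 0) + 1
        if truth == prediction:
            correct[truth] = correct.get(truth, 0) + 1
    overall = 100.0 * sum(correct.values()) / len(true_labels)
    per_class = {c: 100.0 * correct.get(c, 0) / totals[c] for c in totals}
    average = sum(per_class.values()) / len(per_class)
    return overall, average, per_class


# Toy example: a large class with one error and a small, perfectly classified class.
truth = ["corn", "corn", "corn", "corn", "oats"]
predicted = ["corn", "corn", "corn", "oats", "oats"]
oa, aa, per_class = overall_and_average_accuracy(truth, predicted)
```

Because AA weights every class equally, small classes that are classified perfectly (such as Oats, with 5 test samples in Table 1) raise AA much more than OA, which is one reason the two measures can differ by several points on the Indian Pines image.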

Table 1 summarizes global and class-specific accuracies of the
pixelwise SVM classification and the proposed CaHO technique with
W = 1.5 for the Indian Pines image. In order to compare the results
of the proposed method with other advanced techniques, we have
included results of the ECHO classification [10], a classification
using the construction of a minimum spanning forest from the
SVM-derived markers (SVMMSF) [4] and a classification by majority
voting within neighborhoods defined by HSEG segmentation (HSEG+MV,
with $S_{wght} = 0.0$, which is equivalent to HSWO, and the SAM DC)
[5]. Table 3 gives global accuracies of the SVM, SVMMSF, HSEG+MV
and CaHO classification methods for the Center of Pavia image. As
can be seen from the tables, the HSEG+MV and the CaHO methods yield
the best global and most of the class-specific accuracies (the
average accuracies of these approaches are not significantly
different). However, in the HSEG+MV method a segmentation map was
chosen interactively from the segmentation hierarchy, while the
CaHO method is automatic. Fig. 2(b) shows the CaHO classification
map (with MSE $DC_{spectral}$, W = 1.5), which is less noisy than
the SVM map (see Fig. 2(a)).

4. CONCLUSIONS 

In this paper, a new CaHO method for spectral-spatial
classification of hyperspectral images is proposed. The method
consists of performing a probabilistic pixelwise classification,
followed by a hierarchical optimization, where at each step the two
“closest” neighboring regions are merged, and the classification
map is recomputed. Experimental results demonstrate that the
proposed method improves classification accuracies, when compared
to previously proposed classification schemes, and is sufficiently
robust for classifying different kinds of images.